Robust k-means: a Theoretical Revisit
Abstract
In recent years, many variations of the quadratic k-means clustering procedure have been proposed, all aiming to robustify the performance of the algorithm in the presence of outliers. In general terms, two main approaches have been developed: one based on penalized regularization methods, and one based on trimming functions. In this work, we present a theoretical analysis of the robustness and consistency properties of a variant of the classical quadratic k-means algorithm, the robust k-means, which borrows ideas from outlier detection in regression. We show that two outliers in a dataset are enough to break down this clustering procedure. However, if we focus on “well-structured” datasets, then robust k-means can recover the underlying cluster structure in spite of the outliers. Finally, we show that, with slight modifications, the most general non-asymptotic results for consistency of quadratic k-means remain valid for this robust variant.
Similar resources
Utilizing Robust Data Envelopment Analysis Model for Measuring Efficiency of Stock, A case study: Tehran Stock Exchange
Uncertainty is a prominent feature of real world problems and more especially financial markets; with this in mind, dealing with uncertainty becomes a necessary part of performance evaluation by means of data envelopment analysis. This paper presents three robust data envelopment analysis (DEA) models and their application for performance evaluation in Tehran Stock Exchange (TSE). Based on the resu...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in prevention, diagnosis and treatment of the diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering the gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
A robust approach to multi period covering location-allocation problem in pharmaceutical supply chain
This paper proposes a discrete capacitated covering location-allocation model for pharmaceutical centers. In the presented model, two objectives are considered; the first one is minimization of costs and the second one tries to maximize customer satisfaction by definition of social justice. Social justice in the model means that we consider customer satisfaction by using distance. The introduced...
Impacts of the Negative-exponential and the K-distribution modeled FSO turbulent links on the theoretical and simulated performance of the distributed diffusion networks
Merging the adaptive networks with the free space optical (FSO) communication technology is a very interesting field of research because by adding the benefits of this technology, the adaptive networks become more efficient, cheap and secure. This is due to the fact that FSO communication uses unregistered visible light bandwidth instead of the overused radio spectrum. However, in spite of all ...
Saturated Neural Adaptive Robust Output Feedback Control of Robot Manipulators: An Experimental Comparative Study
In this study, an observer-based tracking controller is proposed and evaluated experimentally to solve the trajectory tracking problem of robotic manipulators with the torque saturation in the presence of model uncertainties and external disturbances. In comparison with the state-of-the-art observer-based controllers in the literature, this paper introduces a saturated observer-based controller bas...
Publication date: 2016